1,726 research outputs found

    Revisiting Precision and Recall Definition for Generative Model Evaluation

    In this article we revisit the definition of Precision-Recall (PR) curves for generative models proposed by Sajjadi et al. (arXiv:1806.00035). Rather than providing a single scalar for generative quality, PR curves distinguish mode collapse (poor recall) from bad sample quality (poor precision). We first generalize their formulation to arbitrary measures, thereby removing any restriction to finite support. We also expose a bridge between PR curves and the type I and type II error rates of likelihood ratio classifiers on the task of discriminating between samples of the two distributions. Building upon this new perspective, we propose a novel algorithm to approximate precision-recall curves that shares some interesting methodological properties with the hypothesis testing technique of Lopez-Paz et al. (arXiv:1610.06545). We demonstrate the advantages of the proposed formulation over the original approach on controlled multi-modal datasets. Comment: ICML 201
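As a rough illustration of the original finite-support definition that this abstract generalizes, the sketch below evaluates the PR curve of Sajjadi et al. for two discrete distributions, using precision α(λ) = Σ min(λ·P(ω), Q(ω)) and recall β(λ) = α(λ)/λ. The function name `prd_curve`, the angular grid of slopes, and the toy distributions are assumptions made for this example; it is not the approximation algorithm proposed in the article.

```python
import numpy as np

def prd_curve(p_ref, q_model, num_angles=201):
    """Precision/recall pairs for two discrete distributions on the same support.

    p_ref   : probabilities of the reference (real) distribution P
    q_model : probabilities of the model (generated) distribution Q
    Returns arrays (precision, recall), one entry per slope lambda.
    """
    p = np.asarray(p_ref, dtype=float)
    q = np.asarray(q_model, dtype=float)
    # Slopes lambda = tan(angle) for angles strictly inside (0, pi/2),
    # so that lambda sweeps (0, inf) without hitting 0 or infinity.
    angles = np.linspace(0.0, np.pi / 2, num_angles + 2)[1:-1]
    lambdas = np.tan(angles)
    # alpha(lambda) = sum_w min(lambda * P(w), Q(w))
    precision = np.minimum(lambdas[:, None] * p[None, :], q[None, :]).sum(axis=1)
    # beta(lambda) = sum_w min(P(w), Q(w) / lambda) = alpha(lambda) / lambda
    recall = precision / lambdas
    return precision, recall

# Toy case: the model puts all its mass on one of two equally likely modes.
precision, recall = prd_curve([0.5, 0.5], [1.0, 0.0])
print(precision.max(), recall.max())
```

In this toy mode-collapse case the best attainable precision approaches 1 while recall never exceeds 0.5, which is the distinction a single scalar score cannot express.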

    The Column Density Distribution Function at z=0 from HI Selected Galaxies

    We have measured the column density distribution function, f(N), at z=0 using 21-cm HI emission from galaxies selected from a blind HI survey. f(N) is found to be smaller and flatter at z=0 than indicated by high-redshift measurements of Damped Lyman-alpha (DLA) systems, consistent with the predictions of hierarchical galaxy formation. The derived DLA number density per unit redshift, dn/dz = 0.058, is in moderate agreement with values calculated from low-redshift QSO absorption line studies. We use two different methods to determine the types of galaxies which contribute most to the DLA cross-section: comparing the power law slope of f(N) to theoretical predictions and analysing contributions to dn/dz. We find that comparison of the power law slope cannot rule out spiral discs as the dominant galaxy type responsible for DLA systems. Analysis of dn/dz, however, is much more discriminating. We find that galaxies with log M_HI < 9.0 make up 34% of dn/dz; Irregular and Magellanic types contribute 25%; galaxies with surface brightness > 24 mag arcsec^{-2} account for 22%; and sub-L* galaxies contribute 45% to dn/dz. We conclude that a large range of galaxy types give rise to DLA systems, not just large spiral galaxies as previously speculated. Comment: 13 pages, low resolution figures in the appendix, MNRAS accepted

    Local Column Density Distribution Function from HI selected galaxies

    The cross-section of sky occupied by a particular neutral hydrogen column density provides insight into the nature of Lyman-alpha absorption systems. We have measured this column density distribution at z=0 using 21-cm HI emission from a blind survey. A subsample of HI Parkes All Sky Survey (HIPASS) galaxies has been imaged with the Australia Telescope Compact Array (ATCA). The contribution of low HI mass galaxies (10^7.5 to 10^8 M_solar) is compared to that of M_star (10^10 to 10^10.5 M_solar) galaxies. We find that the column density distribution function is dominated by low HI mass galaxies with column densities in the range 3x10^18 to 2x10^20 cm^-2. This result is not intuitively obvious. M_star galaxies may contain the bulk of the HI gas, but the cross-section presented by low HI mass galaxies (10^7.5 to 10^8 M_solar) is greater at moderate column densities. This result implies that moderate column density Lyman-alpha absorption systems may be caused by a range of galaxy types and not just large spiral galaxies as originally thought. Comment: 5 pages, including 1 figure. To appear in "Extragalactic Gas at Low Redshift" (ASP Conf. Series, Weymann Conf.)

    Evolution of damped Lyman alpha kinematics and the effect of spatial resolution on 21-cm measurements

    We have investigated the effect of spatial resolution on determining pencil-beam-like velocity widths and column densities in galaxies. Three 21-cm datasets are used: the HIPASS galaxy catalogue, a subset of HIPASS galaxies with ATCA maps, and a high-resolution image of the LMC. Velocity widths measured from 21-cm emission in local galaxies are compared with those measured in intermediate-redshift Damped Lyman alpha (DLA) absorbers. We conclude that spatial resolution has a severe effect on measuring pencil-beam-like velocity widths in galaxies. Spatial smoothing by a factor of 240 is shown to increase the median velocity width by a factor of two. Thus any difference between velocity widths measured from global profiles or low spatial resolution 21-cm maps at z=0 and DLAs at z>1 cannot unambiguously be attributed to galaxy evolution. The effect on column density measurements is less severe, and the values of dN/dz from local low-resolution 21-cm measurements are expected to be overestimated by only ~10 per cent. Comment: 5 pages, 6 figures, accepted for publication in MNRAS letter

    Generating Private Data Surrogates for Vision Related Tasks

    With the widespread application of deep networks in industry, membership inference attacks, i.e. the ability to discern training data from a model, are becoming more and more problematic for data privacy. Recent work suggests that generative networks may be robust against membership attacks. In this work, we build on this observation, offering a general-purpose solution to the membership privacy problem. As the primary contribution, we demonstrate how to construct surrogate datasets, using images from GAN generators, labelled with a classifier trained on the private dataset. Next, we show that this surrogate data can further be used for a variety of downstream tasks (here classification and regression), while being resistant to membership attacks. We study a variety of different GANs proposed in the literature, concluding that higher quality GANs result in better surrogate data with respect to the task at hand.

    On the Theoretical Equivalence of Several Trade-Off Curves Assessing Statistical Proximity

    The recent advent of powerful generative models has triggered the renewed development of quantitative measures to assess the proximity of two probability distributions. While the scalar Fréchet inception distance remains popular, several methods have explored computing entire curves, which reveal the trade-off between the fidelity and variability of the first distribution with respect to the second one. Several such variants have been proposed independently and, while intuitively similar, their relationship has not yet been made explicit. In an effort to make the emerging picture of generative evaluation clearer, we propose a unification of four curves known respectively as: the precision-recall (PR) curve, the Lorenz curve, the receiver operating characteristic (ROC) curve and a special case of Rényi divergence frontiers. In addition, we discuss possible links between PR / Lorenz curves and the derivation of domain adaptation bounds. Comment: 10 pages, 3 figures

    Detecting Overfitting of Deep Generative Networks via Latent Recovery

    State-of-the-art deep generative networks are capable of producing images with such incredible realism that they can be suspected of memorizing training images. This is why it is not uncommon to include visualizations of training set nearest neighbors, to suggest that generated images are not simply memorized. We demonstrate that this is not sufficient, which motivates the need to study memorization/overfitting of deep generators with more scrutiny. This paper addresses this question by i) showing how simple losses are highly effective at reconstructing images for deep generators and ii) analyzing the statistics of reconstruction errors when reconstructing training and validation images, which is the standard way to analyze overfitting in machine learning. Using this methodology, this paper shows that overfitting is not detectable in the pure GAN models proposed in the literature, in contrast with those using hybrid adversarial losses, which are amongst the most widely applied generative methods. The paper also shows that standard GAN evaluation metrics fail to capture memorization for some deep generators. Finally, the paper also shows how off-the-shelf GAN generators can be successfully applied to face inpainting and face super-resolution using the proposed reconstruction method, without hybrid adversarial losses.
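The general idea of recovering a latent code by minimizing a reconstruction loss can be sketched on a toy problem. The example below assumes a hypothetical linear "generator" G(z) = Wz and a plain squared-error loss, both chosen only so the code stays self-contained; the paper's actual generators, losses, and optimizer are not specified in this abstract, and the name `recover_latent` is made up for the example.

```python
import numpy as np

def recover_latent(W, x, steps=5000):
    """Gradient descent on ||W z - x||^2 for a toy linear generator G(z) = W z.

    Returns the recovered latent code and the final reconstruction error.
    """
    z = np.zeros(W.shape[1])
    # Step size 1/L, where L = 2 * sigma_max(W)^2 is the gradient's Lipschitz constant.
    lr = 1.0 / (2.0 * np.linalg.norm(W, 2) ** 2)
    for _ in range(steps):
        residual = W @ z - x
        z -= lr * 2.0 * W.T @ residual  # gradient of the squared error w.r.t. z
    return z, float(np.sum((W @ z - x) ** 2))

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 3))            # hypothetical generator weights
x_generated = W @ rng.normal(size=3)   # a sample the generator can reproduce exactly
x_heldout = rng.normal(size=8)         # a sample outside the generator's range

_, err_generated = recover_latent(W, x_generated)
_, err_heldout = recover_latent(W, x_heldout)
print(err_generated, err_heldout)
```

Comparing the distribution of such reconstruction errors on training images against validation images is the kind of statistic the abstract describes; a systematic gap between the two would point to overfitting.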